Description: Lucene is a subproject of the Apache Software Foundation's Jakarta project. It is an open-source full-text search toolkit: not a complete full-text search engine, but a framework for building one, providing a complete query engine and indexing engine plus a partial text-analysis engine (for two Western languages, English and German). Lucene aims to give software developers a simple, easy-to-use toolkit for adding full-text search to a target system, or for building a complete full-text search engine on top of it. Platform: |
Size: 2320384 |
Author:道哥 |
Hits:
Description: A B-tree for spatial search engines; very practical and especially good. Platform: |
Size: 196608 |
Author:li xiaowen |
Hits:
Description: Code that computes text similarity using cosine similarity, for use in search engines. Platform: |
Size: 5120 |
Author:li xiaowen |
Hits:
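As a rough illustration of the technique named in the entry above (not the uploaded code itself), cosine similarity between two texts can be sketched as follows. The whitespace tokenizer and the function name `cosine_similarity` are assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    # Assumption: lowercase whitespace tokenization; real engines use an analyzer.
    freq_a = Counter(text_a.lower().split())
    freq_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    dot = sum(freq_a[term] * freq_b[term] for term in freq_a.keys() & freq_b.keys())

    norm_a = math.sqrt(sum(count * count for count in freq_a.values()))
    norm_b = math.sqrt(sum(count * count for count in freq_b.values()))

    # An empty text has no direction, so treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Texts with identical term frequencies score close to 1, and texts with no terms in common score 0.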
Description: A simple search engine that uses an inverted index to index files and matches queries against their content. Platform: |
Size: 944128 |
Author:tstao |
Hits:
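The inverted-index approach mentioned in the entry above can be sketched roughly as below. This is a minimal illustration, not the uploaded program: the names `build_index` and `search`, the whitespace tokenizer, and the "every query term must match" rule are all assumptions.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

For example, indexing `{"a.txt": "full text search", "b.txt": "text analysis"}` and searching for `"text search"` returns only `a.txt`, because `b.txt` lacks the term `search`.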
Description: imdict-chinese-analyzer is the intelligent Chinese word segmentation module of the imdict smart dictionary. Its algorithm is based on the Hidden Markov Model (HMM), and it is a Java reimplementation of ICTCLAS, the Chinese word segmenter from the Institute of Computing Technology, Chinese Academy of Sciences. It can provide Simplified Chinese word segmentation support directly to the Lucene search engine. Platform: |
Size: 3256320 |
Author:王同 |
Hits:
Description: A search engine implemented in Java that searches web page links. Platform: |
Size: 162816 |
Author:fu |
Hits:
Description: A search engine implemented in Java. Suitable for beginners to study. Platform: |
Size: 958464 |
Author:sun |
Hits:
Description: A very good search engine development case study from 尚学堂 (Shangxuetang), including detailed development documentation and source code. Platform: |
Size: 9249792 |
Author:石平阳 |
Hits:
Description: Java graduation thesis: a search engine system with source code. This is a ready-to-use set of Java graduation thesis materials, containing the source code of a search engine implemented in Java, technical documentation, and compiled JAR files; if you want to save effort, it can be used as-is. The documentation is especially detailed and the source code is fairly complete, but configuring the environment is rather troublesome. Platform: |
Size: 961536 |
Author:286 |
Hits:
Description: lucene htmlparser paoding customSpider webservice: a complete search engine built on the Lucene toolkit and the Paoding word segmenter, plus a custom crawler that collects and analyzes data. Usable with minor changes. Platform: |
Size: 44039168 |
Author:zhangming |
Hits:
Description: To build the search engine, file information must first be collected from each FTP site and recorded in a database so that it can be searched. The Internet has many FTP sites. To collect information from one of them, the site's details are read from a data table and the program logs in to the site. Most FTP servers provide a public access area, known as "anonymous FTP", that offers free file information services to the public; the user name is generally Anonymous and the password is an email address. The data collection program logs in with this user name and password, then crawls every directory on the site, reading the file information in each directory. After receiving the file information, it parses it and stores it in the corresponding data table fields. Once collection for this site is complete, it reads the details of another FTP site and collects its file information. This loop continues until file information has been collected from all known FTP sites. Platform: |
Size: 1024 |
Author:yangyanmei |
Hits:
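The collection loop described in the entry above could be sketched as follows. This is only an illustration under stated assumptions, not the uploaded program: the function names, the `files` table layout, and the placeholder email address are hypothetical, and the listing step assumes the server supports the MLSD command (many older servers do not, and would need `LIST` output parsing instead).

```python
import sqlite3
from ftplib import FTP


def crawl(list_dir, root="/"):
    """Visit every directory reachable from root; yield (path, size) per file.

    list_dir(directory) must yield (name, kind, size) tuples,
    where kind is "dir" or "file"; other kinds are ignored.
    """
    pending = [root]
    while pending:
        directory = pending.pop()
        for name, kind, size in list_dir(directory):
            path = directory.rstrip("/") + "/" + name
            if kind == "dir":
                pending.append(path)
            elif kind == "file":
                yield path, size


def ftp_lister(ftp):
    """Adapt an ftplib.FTP connection to the list_dir interface (needs MLSD)."""
    def list_dir(directory):
        for name, facts in ftp.mlsd(directory, facts=["type", "size"]):
            # "cdir"/"pdir" entries (".", "..") are skipped later by crawl().
            yield name, facts.get("type"), int(facts.get("size", 0))
    return list_dir


def index_site(host, db, email="user@example.com"):
    """Log in anonymously, crawl the site, and store the results in SQLite."""
    # Hypothetical table layout, chosen for this sketch only.
    db.execute(
        "CREATE TABLE IF NOT EXISTS files (host TEXT, path TEXT, size INTEGER)"
    )
    with FTP(host) as ftp:
        ftp.login("anonymous", email)  # anonymous FTP: email as password
        rows = [(host, path, size) for path, size in crawl(ftp_lister(ftp))]
    db.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)
    db.commit()
```

Repeating `index_site` for each host read from a site table, for example with `db = sqlite3.connect("ftp_index.db")`, gives the per-site loop the description outlines. Keeping `crawl` independent of the network connection also lets the traversal be exercised against an in-memory directory tree.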
Description: The page ranking algorithm used by Google. It is now studied by search engine learners almost everywhere in the world. Platform: |
Size: 168960 |
Author:bill |
Hits:
Description: If you are learning about Java search engines, take a look at this; it will help improve your Java skills. Platform: |
Size: 961536 |
Author:曾育高 |
Hits: